Bayesian Identification of Closely-Spaced Chords from Single-Frame STFT Peaks
Identifying chords and related musical attributes from digital audio has been a research problem for decades. Robust identification may facilitate automatic transcription, semantic indexing, polyphonic source separation and other emerging applications. To this end, we develop a Bayesian inference engine operating on single-frame STFT peaks. Peak likelihoods conditional on pitch component information are evaluated by an MCMC approach that accounts for overlapping harmonics as well as undetected and spurious peaks, thus facilitating operation in noisy environments at very low computational cost. Our inference engine evaluates posterior probabilities of musical attributes such as root, chroma (including inversion), octave and tuning, given STFT peak frequency and amplitude observations. The resulting posteriors become highly concentrated around the correct attributes, as demonstrated using 227 ms piano recordings with −10 dB additive white Gaussian noise.
Application of Raster Scanning Method to Image Sonification, Sound Visualization, Sound Analysis and Synthesis
Raster scanning is a technique for generating or recording a video image by means of a line-by-line sweep, tantamount to a data mapping scheme between one- and two-dimensional spaces. While this geometric structure has been widely used in many data transmission and storage systems as well as most video display and capture devices, its application to audio-related research or art is rare. In this paper, the data mapping mechanism of raster scanning is proposed as a framework for both image sonification and sound visualization. This mechanism is simple, and produces compelling results when used for sonifying image texture and visualizing sound timbre. In addition to its potential as a cross-modal representation, the two mappings are complementary and can be applied sequentially to create a chain of sonifications and visualizations using digital filters, thus suggesting a useful creative method of audio processing. Special attention is paid to the rastrogram (a raster visualization of sound) as an intuitive visual interface to audio data. In addition to being an efficient means of sound representation that provides a meaningful display of significant auditory features, the rastrogram is applied to sound analysis by visualizing characteristics of the loop filters used in a Karplus-Strong model. A new sound synthesis method based on texture analysis/synthesis of the rastrogram is also suggested.
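The one- to two-dimensional mapping described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `rasterize` and `derasterize`, the `width` parameter, and the choice to drop trailing samples that do not fill a row are all assumptions made for illustration. The example also assumes the row width is set to exactly one period of a test sine wave, so that every row of the resulting image is the same.

```python
import math


def rasterize(samples, width):
    """Sound -> image: lay samples out row by row, `width` samples per row.

    Assumption of this sketch: trailing samples that do not fill a
    complete row are dropped.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = len(samples) // width
    return [list(samples[r * width:(r + 1) * width]) for r in range(rows)]


def derasterize(image):
    """Image -> sound: read pixels left to right, top to bottom."""
    return [value for row in image for value in row]


# Hypothetical test signal: a sine wave whose period is exactly 8 samples.
period = 8
signal = [math.sin(2 * math.pi * n / period) for n in range(4 * period)]

# With width equal to the period, each row holds one cycle of the wave.
image = rasterize(signal, period)

# Reading the image back in raster order recovers the original samples.
recovered = derasterize(image)
```

Because the two functions are inverses whenever the signal length is a multiple of `width`, an image can be sonified, filtered as audio, and rasterized again, which is the kind of chained processing the abstract mentions.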